Lower Bounds for Two-Sample Structural Change Detection in Ising and Gaussian Models
The change detection problem is to determine if the Markov network structures
of two Markov random fields differ from one another given two sets of samples
drawn from the respective underlying distributions. We study the trade-off
between the sample sizes and the reliability of change detection, measured as a
minimax risk, for the important cases of the Ising models and the Gaussian
Markov random fields restricted to models whose network structures have p nodes and degree at most d, and obtain information-theoretic lower
bounds for reliable change detection over these models. For the Ising model, we lower-bound the number of samples required from each dataset to detect even the sparsest possible changes; for the Gaussian, we lower-bound the number of samples required from each dataset to detect change, with a bound that depends on the smallest ratio of off-diagonal to diagonal terms in the precision matrices of the distributions. These bounds are compared to the corresponding results in
structure learning, and closely match them under mild conditions on the model
parameters. Thus, our change detection bounds inherit partial tightness from
the structure learning schemes in previous literature, demonstrating that in
certain parameter regimes, the naive structure-learning-based approach to change detection is minimax optimal up to constant factors.
Comment: Presented at the 55th Annual Allerton Conference on Communication, Control, and Computing, Oct. 2017.
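As a concrete illustration of the naive learn-then-compare baseline discussed above, the following stdlib-only Python sketch tests a single Gaussian edge by estimating its strength (here, a correlation) separately from each dataset and flagging a change when the two estimates differ. The function names, threshold, and toy data are hypothetical illustrations, not the paper's construction.

```python
import math
import random

def corr(xs, ys):
    """Empirical Pearson correlation of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def edge_changed(sample1, sample2, tau=0.5):
    """Naive two-sample change test for one edge: estimate the edge
    strength from each dataset and flag a change when the estimates
    differ by more than tau (a hypothetical threshold)."""
    return abs(corr(*sample1) - corr(*sample2)) > tau

rng = random.Random(0)
n = 2000
# Dataset 1: strongly coupled pair (an "edge" is present).
x1 = [rng.gauss(0, 1) for _ in range(n)]
y1 = [x + 0.3 * rng.gauss(0, 1) for x in x1]
# Dataset 2: independent pair (the edge has been deleted).
x2 = [rng.gauss(0, 1) for _ in range(n)]
y2 = [rng.gauss(0, 1) for _ in range(n)]

print(edge_changed((x1, y1), (x2, y2)))  # prints True: change detected
```

The paper's lower bounds concern exactly how large n must be before such a comparison can be reliable over a whole degree-bounded network.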
Two studies in resource-efficient inference: structural testing of networks, and selective classification
Inference systems suffer costs arising from information acquisition, and from communication and computational costs of executing complex models. This dissertation proposes, in two distinct themes, systems-level methods to reduce these costs without affecting the accuracy of inference by using ancillary low-cost methods to cheaply address most queries, while only using resource-heavy methods on 'difficult' instances.
The first theme concerns testing methods in structural inference of networks and graphical models, the proposal being that one first cheaply tests whether the structure underlying a dataset differs from a reference structure, and only estimates the new structure if this difference is large. This study focuses on theoretically establishing separations between the costs of testing and learning to determine when a strategy such as the above has benefits. For two canonical models---the Ising model, and the stochastic block model---fundamental limits are derived on the costs of one- and two-sample goodness-of-fit tests by determining information-theoretic lower bounds, and developing matching tests. A biphasic behaviour in the costs of testing is demonstrated: there is a critical size scale such that detection of differences smaller than this size is nearly as expensive as recovering the structure, while detection of larger differences has vanishing costs relative to recovery.
The second theme concerns using selective classification (SC), or classification with an option to abstain, to control inference-time costs in the machine learning framework. The proposal is to learn a low-complexity selective classifier that abstains only on hard instances, and to execute more expensive methods upon abstention. Herein, a novel SC formulation with a focus on high accuracy is developed, and used to obtain both theoretical characterisations and a scheme for learning selective classifiers based on optimising a collection of class-wise decoupled one-sided risks. This scheme attains strong empirical performance and admits efficient implementation, leading to an effective SC methodology. Finally, SC is studied in the online learning setting with feedback provided only upon abstention, modelling the practical lack of reliable labels without expensive feature collection, and a Pareto-optimal low-error scheme is described.
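The abstain-and-defer idea above can be caricatured in a few lines: a cheap classifier answers only when a class-wise acceptance rule is satisfied, and otherwise returns an abstention that triggers a costlier model. This is a hypothetical nearest-centroid sketch with invented thresholds, not the dissertation's actual learning scheme.

```python
def selective_predict(x, centroids, thresholds):
    """Nearest-centroid classifier with an abstain option.
    Predicts class c only if x falls within that class's own
    distance threshold (a class-wise, one-sided acceptance rule);
    otherwise returns None, deferring to a more expensive model."""
    dists = {c: abs(x - m) for c, m in centroids.items()}
    c = min(dists, key=dists.get)
    return c if dists[c] <= thresholds[c] else None

centroids = {"a": 0.0, "b": 10.0}
thresholds = {"a": 2.0, "b": 2.0}   # tuned per class to hit an accuracy target

print(selective_predict(1.0, centroids, thresholds))   # prints a: confident
print(selective_predict(5.0, centroids, thresholds))   # prints None: abstain
```

Decoupling the thresholds per class mirrors the class-wise one-sided risks in the abstract: each class's acceptance region can be tightened independently.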
Doubly-Optimistic Play for Safe Linear Bandits
The safe linear bandit problem (SLB) is an online approach to linear
programming with unknown objective and unknown round-wise constraints, under
stochastic bandit feedback of rewards and safety risks of actions. We study
aggressive \emph{doubly-optimistic play} in SLBs, and its role in avoiding the strong assumptions and poor efficacy associated with extant pessimistic-optimistic solutions.
We first elucidate an inherent hardness in SLBs due to the lack of knowledge of
constraints: there exist `easy' instances, for which suboptimal extreme points
have large `gaps', but on which SLB methods must still incur non-trivial regret and safety violations due to an inability to refine the location of
optimal actions to arbitrary precision. In a positive direction, we propose and
analyse a doubly-optimistic confidence-bound based strategy for the safe linear
bandit problem, DOSLB, which exploits supreme optimism by using optimistic
estimates of both reward and safety risks to select actions. Using a novel dual
analysis, we show that despite the lack of knowledge of constraints, DOSLB
rarely takes overly risky actions, and obtains tight instance-dependent
bounds on both efficacy regret and net safety violations up to
any finite precision, thus yielding large efficacy gains at a small safety cost
and without strong assumptions. Concretely, we argue that the algorithm activates noisy versions of an `optimal' set of constraints at each round, and that activation of suboptimal sets of constraints is limited by the larger of a safety and efficacy gap we define.
Comment: v2: extensive rewrite, with a much cleaner exposition of the theory, and improvements in key definitions.
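To make the doubly-optimistic selection rule concrete, here is a heavily simplified one-dimensional caricature, not the paper's DOSLB algorithm: both the reward and the safety risk are modelled linearly, and the most rewarding action whose *most favourable* risk estimate still clears the cap is played, so optimism is applied to both quantities. All names, constants, and the toy model are assumptions for illustration.

```python
import random

def doubly_optimistic_action(history, actions, risk_cap, beta=2.0):
    """One-dimensional doubly-optimistic step: least-squares slopes for
    reward (theta) and risk (gamma), a shared confidence width w, then
    maximise the optimistic reward over actions whose optimistic
    (lowest plausible) risk satisfies the constraint."""
    s = sum(a * a for a, _, _ in history)
    if s == 0:
        return next(a for a in actions if a > 0)    # forced initial exploration
    theta = sum(a * r for a, r, _ in history) / s   # reward-slope estimate
    gamma = sum(a * q for a, _, q in history) / s   # risk-slope estimate
    w = beta / s ** 0.5                             # hypothetical confidence width
    feasible = [a for a in actions if (gamma - w) * a <= risk_cap]
    if not feasible:
        return min(actions)                         # fallback: safest action
    return max(feasible, key=lambda a: (theta + w) * a)

rng = random.Random(1)
theta_true, gamma_true, risk_cap = 1.0, 2.0, 1.0    # safe optimum at a = 0.5
actions = [i / 10 for i in range(11)]
history = []
for _ in range(500):
    a = doubly_optimistic_action(history, actions, risk_cap)
    history.append((a,
                    theta_true * a + 0.1 * rng.gauss(0, 1),
                    gamma_true * a + 0.1 * rng.gauss(0, 1)))
print(history[-1][0])  # late rounds concentrate near the safe optimum
```

Early rounds may overshoot the constraint, exactly the "risky actions" the analysis must bound; as the confidence width shrinks, the feasible set tightens around the true safe region.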
Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk
We investigate a natural but surprisingly unstudied approach to the
multi-armed bandit problem under safety risk constraints. Each arm is
associated with an unknown law on safety risks and rewards, and the learner's
goal is to maximise reward whilst not playing unsafe arms, as determined by a
given threshold on the mean risk.
We formulate a pseudo-regret for this setting that enforces this safety
constraint in a per-round way by softly penalising any violation, regardless of
the gain in reward due to the same. This has practical relevance to scenarios
such as clinical trials, where one must maintain safety for each round rather
than in an aggregated sense.
We describe doubly optimistic strategies for this scenario, which maintain optimistic indices for both safety risk and reward. We show that schemes based on both frequentist and Bayesian indices satisfy tight gap-dependent logarithmic regret bounds, and further that these play unsafe arms only logarithmically many times in total. This theoretical analysis is complemented by simulation studies demonstrating the effectiveness of the proposed schemes, and probing the domains in which their use is appropriate.
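The frequentist variant of the doubly optimistic index strategy can be sketched with standard UCB-style indices; the constants, means, and threshold below are hypothetical, and this is an illustrative caricature rather than the paper's exact scheme.

```python
import math
import random

def doubly_optimistic_arm(pulls, reward_sums, risk_sums, t, risk_cap):
    """Optimistic (upper) index for reward, optimistic (lower) index for
    risk; among arms whose optimistic risk clears the threshold, play the
    arm with the highest reward index."""
    K = len(pulls)
    for k in range(K):
        if pulls[k] == 0:
            return k                               # pull each arm once first
    bonus = [math.sqrt(2 * math.log(t) / pulls[k]) for k in range(K)]
    reward_ucb = [reward_sums[k] / pulls[k] + bonus[k] for k in range(K)]
    risk_lcb = [risk_sums[k] / pulls[k] - bonus[k] for k in range(K)]
    feasible = [k for k in range(K) if risk_lcb[k] <= risk_cap]
    if not feasible:
        return min(range(K), key=lambda k: risk_lcb[k])
    return max(feasible, key=lambda k: reward_ucb[k])

rng = random.Random(0)
mean_reward = [0.9, 0.5]          # arm 0 is more rewarding ...
mean_risk = [0.9, 0.1]            # ... but unsafe: risk exceeds the 0.5 cap
risk_cap = 0.5
K, T = 2, 3000
pulls, reward_sums, risk_sums = [0] * K, [0.0] * K, [0.0] * K
for t in range(1, T + 1):
    k = doubly_optimistic_arm(pulls, reward_sums, risk_sums, t, risk_cap)
    pulls[k] += 1
    reward_sums[k] += float(rng.random() < mean_reward[k])
    risk_sums[k] += float(rng.random() < mean_risk[k])
print(pulls)  # the unsafe arm is pulled only a logarithmic number of times
```

Once the unsafe arm's risk lower-confidence bound rises above the cap, it leaves the feasible set for good, matching the logarithmic bound on unsafe plays described in the abstract.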
Limits on testing structural changes in Ising models
We present novel information-theoretic limits on detecting sparse changes in Ising models, a problem that arises in many applications where network changes can occur due to some external stimuli. We show that the sample complexity for detecting sparse changes, in a minimax sense, is no better than that of learning the entire model, even in settings with local sparsity. This is a surprising fact in light of prior work rooted in sparse recovery methods, which suggests that sample complexity in this context scales only with the number of network changes. To shed light on when change detection is easier than structure learning, we consider testing of edge deletion in forest-structured graphs, and high-temperature ferromagnets, as case studies. We show for these that testing of small changes is similarly hard, but testing of \emph{large} changes is well-separated from structure learning. These results imply that testing of graphical models may not be amenable to concepts such as restricted strong convexity leveraged for sparsity pattern recovery, and that algorithm development should instead be directed towards detection of large changes.
https://arxiv.org/abs/2011.0367
Piecewise linear regression via a difference of convex functions
We present a new piecewise-linear regression methodology that fits a difference of convex functions (DC functions) to the data. These are functions f that may be represented as the difference φ1 − φ2 for a choice of convex functions φ1, φ2. The method proceeds by estimating piecewise-linear convex functions, in a manner similar to max-affine regression, whose difference approximates the data. The choice of the function is regularised by a new seminorm over the class of DC functions that controls the ℓ∞ Lipschitz constant of the estimate.
The resulting methodology can be efficiently implemented via quadratic programming even in high dimensions, and is shown to have close to minimax statistical risk. We empirically validate the method, showing it to be practically implementable, and to have comparable performance to existing regression/classification methods on real-world datasets.
http://proceedings.mlr.press/v119/siahkamari20a/siahkamari20a.pdf
Published version
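Underlying the method is the fact that piecewise-linear functions can be written as differences of convex max-affine functions. The following stdlib-only check, a hypothetical example rather than the paper's estimator, represents a concave "hat" function in DC form and verifies the identity pointwise.

```python
def max_affine(x, pieces):
    """Evaluate a convex, piecewise-linear (max-affine) function
    x -> max_i (a_i * x + b_i), given pieces = [(a_i, b_i), ...]."""
    return max(a * x + b for a, b in pieces)

def dc_eval(x, phi1, phi2):
    """Evaluate a DC function f = phi1 - phi2, each given as a
    max-affine piece list."""
    return max_affine(x, phi1) - max_affine(x, phi2)

# The concave 'hat' h(x) = min(x, 2 - x) is not convex, but it is DC:
# min(x, 2 - x) = 0 - max(-x, x - 2), i.e. phi1 = 0, phi2 = max(-x, x - 2).
phi1 = [(0.0, 0.0)]
phi2 = [(-1.0, 0.0), (1.0, -2.0)]

for x in [-1.0, 0.0, 0.5, 1.0, 1.7, 3.0]:
    assert abs(dc_eval(x, phi1, phi2) - min(x, 2 - x)) < 1e-12
print("hat(1.0) =", dc_eval(1.0, phi1, phi2))  # prints hat(1.0) = 1.0
```

In the regression method itself, the two max-affine components are estimated jointly from data, with the seminorm penalty keeping the pieces' slopes, and hence the estimate's Lipschitz constant, under control.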